WATER RESOURCES RESEARCH, VOL. 43, W08403, DOL10.1029/2006WR005525, 2007 



Usefulness of the Reversible Jump Markov Chain Monte Carlo 
Model in Regional Flood Frequency Analysis 

M. Ribatet, 1,2 E. Sauquet, 1 JM. Gresillon, 1 and T. B.M.J. Ouarda 2 



OO 
O 
O 
(N 

X> 
PL, 



Abstract. Regional flood frequency analysis is a convenient way to reduce estimation 
uncertainty when few data are available at the gauging site. In this work, a model that 
allows a non null probability to a regional fixed shape parameter is presented. This method- 
ology is integrated within a Bayesian framework and uses reversible jump techniques. 
The performance on stochastic data of this new estimator is compared to two other mod- 
els: a conventional Bayesian analysis and the index flood approach. Results show that 
the proposed estimator is absolutely suited to regional estimation when only a few data 
is available at the target site. Moreover, unlike the index flood estimator, target site in- 
dex flood error estimation seems to have less impact on Bayesian estimators. Some sug- 
gestions about configurations of the pooling groups are also presented to increase the 
performance of each estimator. Keywords: Regional Frequency Analysis, Extreme Value 

Theory, Generalized Pareto Distribution, Reversible Jumps, Markov Chain Monte Carlo. 
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1. Introduction 

Extreme value theory is now widely applied when mod- 
eling block maxima or exceedences over a threshold is of 
interest. In particular, the Generalized Pareto Distribution 
(GPD) describes the limiting distribution of normalized ex- 
cesses of a threshold as the threshold approaches the end- 
point of the variable [Pickands, 1975]. The GPD has a dis- 
tribution function denned by: 



G(x;n,cr,€) = 1- 



1 + 



$(x - m) 



-vc 



, x > n, 1+ 



£0 - m) 



> 



(1) 

where a > 0, £ £ R. fj,, a and £ are respectively the location, 
scale and shape parameters. 

Thus, when extreme values must be estimated, this ap- 
proximation is frequently used. Most applications based on 
this result are related to environmental sciences, as extreme 
wind speed [Payer and Kuchenhoff , 2004], extreme sea level 
[Bortot and Coles, 2000; Pandey et at, 2004] or extreme 
river discharge [Northrop, 2004]. 

However, one must often deal with small samples and 
large uncertainties on estimation. Several publications point 
out the problem of the shape parameter estimation. This 
parameter is of great interest as it determines the tail be- 
haviour of the distribution. Therefore, many authors ana- 
lyzed the performance of particular estimators given a spec- 
ified range for the shape parameter: Rosbjerg et al. [1992] 
for the method of moments; Coles and Dixon [1999] for 
the maximum likelihood; Hosking and Wallis [1987] for the 
probability weighted moments; Juarez and Schucany [2004] 
for the minimum density power divergence estimator; Mar- 
tins and Stedinger [2000] for a proposed generalized maxi- 
mum likelihood. However, these results provide the most ac- 
curate estimator given the shape parameter; which is never 
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the case in practice. Therefore, Park [2005] introduced a 
systematic way of selecting hyper-parameters for his pro- 
posed generalized maximum likelihood estimator. 

All these approaches only deal with information from the 
target site sample. However, it is frequent in hydrology to 
perform a Regional Frequency Analysis (RFA) . Traditional 
RFA consists of two steps: (a) delineation of homogeneous 
regions i.e. a pooling group of stations with similar be- 
haviour; (b) regional estimation i.e. estimate target site 
distribution from the regional information. 

More recently, Bayesian approaches have been applied 
with success to incorporate regional information in fre- 
quency analysis [Coles and Tawn, 1996; Northrop, 2004; Sei- 
dou et al., 2006; Ribatet et al, 2007]. Empirical Bayesian 
estimators have also been proposed [Kuczera, 1982; Madsen 
and Rosbjerg, 1997]. One of the advantages of these ap- 
proaches is to distinguish the at site information from the 
other sites data in the estimation procedure. This is an im- 
portant point as, no matter how high the homogeneity level 
may be, the only data which represents perfectly the tar- 
get site is obviously the target site one. Thus, the whole 
information available is used more efficiently. In addition, 
according to Ribatet et al. [2007], the Bayesian approaches 
allow to relax the scale invariance property required by the 
most applied RFA model, that is, the index flood [Dalrym- 
ple, I960]. 

However, a preliminary study on simulated data showed 
that the approach developed by Ribatet et al. [2007] may 
lead to unreliable estimates for larger return periods (T > 20 
years) when small samples are involved. This poor perfor- 
mance is mainly due to the large variance on the shape pa- 
rameter estimation. Consequently, for such cases, attention 
must be paid to the regional estimation procedure for the 
shape parameter. 

The basis of our new development was formerly proposed 
by Stephenson and Tawn [2004] . They use reversible jump 
Markov chain Monte Carlo techniques [Green, 1995] to at- 
tribute a non null probability to the Gumbel case. There- 
fore, realizations are not supposed to be Gumbel distributed, 
but have a non null probability to be Gumbel distributed. 
An application to extreme rainfall and sea-level is given. In 
this work, this approach is extended to take into account a 
regional shape parameter, not only the Gumbel/Exponential 
case, within a RFA framework. The reversible jump tech- 
nique allows to focus on a "likely" shape parameter value 
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given by the hydrological relevance of the homogeneous re- 
gion. Thus, this approach may reduce the shape parameter 
variance estimation while relaxing the scale invariance prop- 
erty. 

The main objectives of this article is first to present new 
developments in the methodology proposed by Stephenson 
and Tawn [2004] required for a RFA context; second to as- 
sess the quality of two Bayesian models based on the index 
flood hypothesis: the regional Bayesian model proposed by 
Ribatet et al. [2007] (BAY) and the new proposed Bayesian 
approach applying reversible jumps Markow chains (REV). 
They are compared to the classical index flood approach 
of Dalrymple [1960] (IFL). The assessment is developed 
through a stochastic generation of regional data performed 
in order to obtain realistic features of homogeneous regions. 
Detailing the index flood concept is out of the scope of this 
article. Estimation procedure can be found in Hosking and 
Wallis [1997]. 

The paper is organized as follows. The next two sections 
concentrate on methodological aspects. Section 2 describes 
the Bayesian framework including the specific Markov Chain 
Monte Carlo (MCMC) algorithm, required to extend the 
work by Stephenson and Tawn [2004]. Section 3 presents 
the simple and efficient algorithm to generate stochastically 
hydrological homogeneous regions. A sensitivity analysis 
is performed in section 4 to assess how quantile estimates 
and related uncertainties are influenced by the values of two 
parameters of the reversible jump Markov chains. Section 5 
compares the performance of each estimator on six represen- 
tative case studies. The impact of the bias in the target site 
index flood estimation is analyzed in section 6, while sug- 
gestions for building efficient pooling groups are presented 
in section 7. Finally, some conclusions are drawn in section 



2. Methodology 

In the Bayesian framework, the posterior distribution of 
parameters must be known to derive quantile estimates. The 
posterior distribution tt(9\x) is given by the Bayes Theo- 
rem [Bayes, 1763]: 



tt (0) tt (x; 0) 



cx n (0) tt (x; 0) 



(2) 



V ' ' Jq 7T (0) 7T (x\ 0) d0 

where is the vector of parameters of the distribution to 
be fitted, O is the parameter space, n (x; 0) is the likelihood 
function, x is the vector of observations and n (0) is the prior 
distribution. 

In this study, as excesses over a high threshold are of in- 
terest, the likelihood function n(x; 0) is related to the GPD 
- see equation (1). 

2.1. Prior Distribution 

In this section, the methodology to elicit the prior dis- 
tribution is presented. In this study, regional information 
is used to define the prior distribution. Furthermore, the 
prior is specific as it must account for a fixed shape pa- 
rameter £pi x with a non null probability p^. Let 0o be a 
sub-space of the parameter space O of 0. More precisely, 
Oo = {0 £ © : £, = ?Fix}- P{ is a hyper-parameter of the 
prior distribution. The approach is to construct a suitable 
prior distribution on O; then, for p^ fixed, to modify this 
prior to account for the probability of O . 

For clarity purposes, the prior distribution is defined in 
two steps. First, an initial prior distribution 7Ti n (0) defined 
on O is introduced. Second, a revised prior distribution 7r(0) 
is derived from 7Ti n (0) to attribute a non null probability to 
the 8o sub-sample. 
2.1.1. Initial prior distribution 



As the proposed model is fully parametric, the initial 
prior distribution 7Ti n (0) is a multivariate distribution en- 
tirely defined by its hyper-parameters. In our case study, 
the initial prior distribution corresponds to the one intro- 
duced by Ribatet et al. [2007]. Consequently, the marginal 
prior distributions were supposed to be independent lognor- 
mal for both location and scale parameters and normal for 
the shape parameter. Thus, 



7r in (0) oc Jexp [(0' - 7 ) T £~ 1 (0' - 7)] 



(3) 

where 7, E are hyper-parameters, 0' = (log n, log a, £) and J 
is the Jacobian of the transformation from 0' to 0, namely 
J = I/110. 7 = (71,72,73) is the mean vector, E is the 
covariance matrix. As marginal priors are supposed to be 
independent, E is a 3 x 3 diagonal matrix with diagonal 
elements di , (fe , ^3 . 

Hyper-parameters are defined through the index flood 
concept, that is, all distributions are identical up to an at- 
site dependent constant. Consider all sites of a region except 
the target site - say the j-th site. A set of pseudo target site 
parameters can be computed: 



6 = & 



(i) 



(4) 
(5) 
(6) 



for i 7^ j, where is the target site index flood and 

tr* , are respectively the location, scale and shape 
at-site parameter estimates from the rescaled sample - e.g. 
normalized by its respective index flood estimate. Under the 
hypothesis of the index flood concept, pseudo-parameters 
are expected to be distributed as parameters of the target 
site. 

Information from the target site sample can not be used 
to elicit the prior distribution. Thus, in equations (4) 
and (5) must be estimated without use of the j'-th sample 
site. 

In this case study, is estimated through a General- 
ized Linear Model (GLM) defined by: 

fE[logC (j >] =v, v = Xf3 

\Var [logC* (j) ] =0V» 

where X are basin characteristics (possibly log trans- 
formed), <f> is the dispersion parameter, V the variance func- 
tion and v is the linear predictor. McCullagh and Nelder 
[1989] give a comprehensive introduction to GLM. Other 
alternatives for modeling the target site index flood can be 
considered such as Generalized Additive Models [Wood and 
Augustin, 2002], Neural Networks [Shu and Burn, 2004] or 
Kriging [Merz and Bloschl, 2005]. However, the variance 
of should be estimated. Indeed, as is estimated 
without use of the target site data, uncertainties due to this 
estimation must be incorporated in the prior distribution. 

From these pseudo parameters, hyper-parameters can be 
computed: 



(7) 
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Under the independence assumption between C^' and 
, the following relations hold: 



Var 


[logA W ] 


= Var [logC (j) l 
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(11) 
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"log<x (i) ] 


= Var [logC (j) ] 
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"log 4°] 


(12) 



The independence assumption is not too restrictive as the 

target site index flood is estimated independently from 

,d) Ai) 



' are estimated thanks to Fisher 
information and the delta method. Estimation of 



Note that Var [k>g-i i; 

.. . ... . the c^ t 

Var logC (i) is a special case and depends on the method 
for estimating the at-site index flood. Nevertheless, it is al- 
ways possible to carry out an estimation of this variance, at 
least through standard errors. 
2.1.2. Revised prior distribution 

The initial prior distribution ni n (9) gives a null probabil- 
ity to the sub-sample Oo- Thus, from this initial prior 7n n (0), 
a revised prior n(9) is constructed to attribute a non null 
probability to the 6o sub-sample. According to Stephenson 
and Tawn [2004], ir(0) is defined as : 



n(9) = 



(l-Pe)jTin(fl) 



for e e e\e 

for 9 e Go 



where p^ € [0, 1] and with 



Tin(//,0-, £fIx) 



(13) 



(14) 



for (9 £ Go- The integral in equation (14) can be easily 
evaluated by standard numerical integration methods. 

By construction, the new prior distribution tt(9) gives the 
required probability to the sub-space Go- Stephenson and 
Tawn [2004] have already applied formulations (13) and (14) 
with success for sea-level maxima and rainfall threshold ex- 
ceedences. 

2.2. Posterior Estimation 

As it is often the case in Bayesian analysis, the integral 
in equation (2) is insolvable analytically. MCMC techniques 
are used to overcome this problem. Yet, due to the duality of 
tt(9) distribution, standard Metropolis-Hastings [Hastings, 
1970] within Gibbs [Geman and Geman, 1984] methods are 
not sufficient. Reversible jump techniques [Green, 1995] are 
used to allow moves from the two dimensional space Go to 
the three dimensional space G\Go and vice-versa. 

The classical Bayesian analysis, on G\Go, is performed 
with Gibbs cycle over each component of 9 using Metropolis- 
Hastings updates, with random walk proposals [Coles and 
Tawn, 1996]. 

Stephenson and Tawn [2004] extended this algorithm 
to incorporate the mass on the Gumbel/Exponential case. 
However, as our approach does not only focus on the £fi x = 
case, a new algorithm must be implemented. To help under- 
stand the algorithmic developments, some details about the 
classical Metropolis-Hastings algorithm and the reversible 
jump case are reported in Appendix A. 

The proposed algorithm must deal with two dimensional 
changes: a change to Go from G\Go space and vice-versa. 
These two types of special moves must be defined cautiously. 
As inspired by Stephenson and Tawn [2004], quantiles asso- 
ciated to a non exceedence probability p are set to be equal 
at current state 9 t and proposal 9 plop , p being fixed. 



For a proposal move to G\Go from Go, i.e., £ t = £fi x 
and a proposal shape £ P ro P / ^Fix, the candidate move is to 



change 9 t = (/U t ,<7 t ,£ t ) to 9 P 



(p. 



prop j u prop j <^prop 



/^prop — /^t 



Cprop (y 



'&(ir«p~p -i) 

e Pr o P ~AA(|, s |) 



where 

(15a) 
(15b) 
(15c) 



where y = 1 — p, p being fixed, £ is taken to be the mode of 
the marginal distribution for £ when there is no mass on Go 
[Stephenson and Tawn, 2004], and is the standard devi- 
ation selected to give good mixing properties to the chain. 
As it is usually the case with Metropolis-Hastings updates, 
this move is accepted with probability min(l, A) with 



A = 



71"(/^prop, 0~ prop, £prop [%) 



n(nt,o~t, £Fix|a;) 



prop 

1 - p£ L J 



(16) 

where (/>(■; m,s^) denotes the density function of the Nor- 
mal distribution with mean m and variance s 2 , and J^ Fix is 
the Jacobian of the parameter transformation for quantile 
matching, that is: 



•WO 



CFix V 



1 



If the move is accepted, then Qt+i — (/^prop, Cprop, £prop), 



(17) 
else 



7t+l 



For a proposal move to Go from G\Go, i.e., £t 7^ ^Fix 
and a proposal shape £ P ro P = (.Fix, the proposal is to change 
9 t = (nt,0t,£t) to 6» P rop = (/iprop,0-pro P ,£prop) where 



/^prop — f^t 



Of 



£pr°p(2/ 



?Fix 



1) 



(18a) 
(18b) 
(18c) 



This move is accepted with probability min(l, A) where 



^ (/^prop j <^prop j ^Fix | £') 1 



n(nt,at,(t[x) 



Pi 



(a* 



prop j v prop j 



t (6) (19) 



£prop) else 



If the move is accepted, then 9 t +i 
9t+i = 9 t . 

Obviously, special moves introduced in this study are not 
the only conceivable ones. Other reversible jumps can be ex- 
plored - see for example Stephenson and Tawn [2004] . How- 
ever, for this application, the proposed moves seem to be 
particularly well suited. Indeed, a preliminary study shows 
that the location parameter was well estimated by a regional 
Bayesian approach. Thus, a special move which only affects 
the shape and scale parameters should be consistent. 

3. Generation Procedure 

In this section, the procedure implemented to generate 
stochastic homogeneous regions is described. The idea con- 
sists in generating sample points in a neighborhood of the 
L-moment space (Mean, L-CV, L-Skewness). The genera- 
tion procedure can be summarized as follows: 

1. Set the center of the neighborhood i.e. (Ii,r, tr, T3,r) 
or equivalently parameters of the regional distribution 
(At«,°"fl,£ft); 
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Figure 1. Histogram of the coefficient of determination 
for the regressive model (7). Application of section 5. 



2. Generate N points (h,i, n,T3 z i) uniformly in the sphere 
B((h,R,TR,T3,R);ey : 

3. Generate N index floods C using the scaling model 
parametrization: 

C = aArea? (20) 

Catchment areas are defined as realizations of a lognormal 
random variable. 

4. For each (h,i, t», T3,i), compute adimensional parame- 
ters by: 



3T3 . 



f* = 

* 1+T3,, 

ff? = (€;-i)«;-2)ii, i T i 
* , c * 



5. Then, compute at-site parameters from: 

& = e; 

CTj = C;cr* 

Hi = Cm* 



(21a) 
(21b) 
(21c) 



(22a) 
(22b) 
(22c) 



6. Simulate samples from a GPD with parameters 

As a GLM is used to elicit the prior distribution, the scal- 
ing model (20) must be altered to avoid giving an advantage 
to the Bayesian approaches over the index flood model. For 
this purpose, a noise in relation (20) at step 3 is introduced. 
Thus, areas are altered by adding uniform random variables 
varying in (—0.5 x Area, 0.5 x Area). 

This distortion is necessary to ensure that the regressive 
model is not too competitive and is consistent with observa- 
tions. Indeed large deviations to the area-index flood rela- 
tionship are often encountered in practice. In the following 
applications, a = 0.12, (3 = 1.01 and Area ~ £A/"(4.8, 1). 
These values arise from a previous study on a French data 
set [Ribatet et al, 2007] and ensure realistic magnitudes. For 
the application of section 5, the coefficients of determination 
for the regressive model (7) varies from 0.20 to 0.99, with 
a mean value of 0.89. The histogram of these coefficients 
of determination is presented in Figure 1. The radius e in 
the generation algorithm is set to 0.04. This value is chosen 
to reflect variability met in practice while preserving a low 



dispersion around the regional distribution. The e value pri- 
marily impacts the proportions of regions satisfying Hi < 1. 
For specific applications, regions with a heterogeneity statis- 
tic Hi such as Hi > 1 may be discarded. 

4. Sensitivity Analysis 

In this section, a sensitivity analysis for the algorithm in- 
troduced in section 2.2 is carried out. The primary goal is 
to check if results are not too impacted by the choice of the 
two user-selectable parameters and ^Fix- For this pur- 
pose, the effect of both p^ and ^Fix values on estimates and 
credibility intervals is examined. For this sensitivity analy- 
sis, the parameters of the regional distribution is set to be 
(0.64, 0.48, 0.26). The regions have 20 sites with a sample 
size of 70. For the whole sensitivity analysis, 10 000 regions 
were generated. The target site has a sample size of 10. 
We concentrate on estimates at sites with very few data, to 
exhibit the main differences in the most restricting config- 
uration. Other configurations were found to demonstrate 
features similar to Fig. 3 and Fig. 2. 

4.1. Effect of p € 

The evolution of the normalized biases (expressed in per- 
cent) for return levels with non exceedence probabilities 
0.75, 0.95 and 0.995 associated to several p^ values are de- 
picted in Fig. 3. Each boxplot is obtained from at-site esti- 
mates computed on more than 365 stochastic homogeneous 
regions. The case = corresponds to a classical Bayesian 
approach free from any point mass. In addition, to analyze 
only the effect of the parameter £fi x is temporarily fixed 
to be equal to the theoretical regional shape parameter. 



Table 1. Posterior proportions (in percent) of events 
{fl 6 60} for different values of and £pi x . Target Sample 
Size 60. 
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Figure 2. Posterior marginal density for the shape pa- 
rameter. 
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Figure 3. Effect of p^ value on quantile estimation with non exceedence probabilities 0.75, 0.95 and 
0.995. Sample size 10. £ Pi x = 0.26. 
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Figure 4. Effect of p^ value on 90% posterior credibility interval. Sample Size 10. 



From Fig. 3, the quantile estimates distribution seems 
to be stationary, provided that p^ > 0. Introducing a point 
mass does not impact Qo.75 estimates, whereas significant 
reduction in median biases and scatter of estimates is no- 
ticeable for more extremal quantiles. 

Fig. 4 shows the posterior distributions of return levels 
and 90% posterior credibility intervals for several values. 

It is clear that credibility intervals are sensitive to the 
value. This result is consistent as more and more propos- 
als in the MCMC simulation belong to Oo as p^ increases. 



Thus, by construction, the Markov chain is less variable. As 
denoted by Stephenson and Tawn [2004], the special case 
= 1 is particular as uncertainty in the shape parameter 
is not considered. In that case, credibility intervals could be 
falsely narrow. 

4.2. Effect of £ Fix 

It is important to analyze the influence of the choice of 
£fi x on the simulated Markov chains; and thus, its impact on 
estimations. Indeed, when specifying an unreasonable £fix 
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value, the estimations must not differ significantly from the 
conventional Bayesian ones. For this purpose, Tab. 1 dis- 
plays the posterior proportions of events {6 £ 0o} for several 
£fi x and p^ values. This table is obtained with a target site 
sample size of 60. For each specified ^Pi x value, two features 
are computed to measure the relevance of the ^Pi x value: (a) 
-Rshape the ratio of £fi x to the true shape parameter ; and 
(b) Dshape the ratio of the marginal posterior density from 
a conventional Bayesian analysis evaluated in £ Pix and f . 

-Rshape characterizes how much the point Mass differs 
from the true value. Dshapc quantifies the distance of ^Fi x 
from the estimator of the shape parameter proposed by Ri- 
batet et al. [2007]. Thus, from these two statistics, con- 
sistency of the posterior proportions with deviations from 
theoretical and empirical values can be analyzed. 

The results in Tab. 1 show that values of ^Fi x that are 
not consistent with the data imply low proportions of state 
in Go. Thus, for such values, the proposed model is quite 
similar to a conventional Bayesian analysis. However, for 
two different values of ^Fi x (ifehapc equal to 0.83 and 1), the 
posterior proportions are quite equivalent. This emphasizes 
the large uncertainty on the shape parameter estimation for 
small sample sizes. Uncertainty on the shape parameter 
estimation is also corroborated by the posterior marginal 
distribution of a conventional Bayesian analysis - sec Fig. 2. 

As noticed above, these results are obtained with a tar- 
get site sample size of 60. This particular sample size was 
selected as it is the most illustrative case. However, the pos- 
terior proportions are quite similar when dealing with other 
target site sample sizes - even if for very small sample sizes, 
this is less noticeable. 



5. Simulation Study 

In this section, performance of three different estimators 
are analyzed: a conventional Bayesian estimator (BAY) in- 
troduced by Ribatet et al. [2007], the proposed estimator 
based on reversible jumps (REV) and the index flood esti- 
mator (IFL). In particular, the BAY estimator is related to 
the initial prior distribution defined in Section 2.1.1. Thus, 
the BAY estimator is identical to the REV approach with 

Pe = o. 

For the proposed estimator, the point Mass probability 
P£ was set to be a function of the Hi statistic of Hosking 
and Wallis [1997]; that is: 

_ exp(-iJi) 
P « - l + expt-tfO (23) 

For this parametrization, necessary requirements are sat- 
isfied; i.e. p£ — > when Hi — » +oo and p% — > 1 when 
Hi — ► — oo. Moreover, for Hi — 0, = 0.5 which corre- 
sponds to the estimator introduced by Stephenson and Tawn 
[2004] . Note that p^ in Eq. (23) is defined with the negative 
inverse of the so called logit function. 

Thus, for this choice, as underlined by the sensitivity 
analysis, credibility intervals are related to the degree of 



Table 2. Characteristics of the sixth case studies. The tar- 
get site is omitted in the couple (ng; tc , ngj zc ) and has a sample 
size of: 10, 25 and 40. 







N slte 


(nSito.«Sizc) 


A^Evcnts 


Confl 


(0.64, 0.48, 0.26) 


10 


(9,50) 


450 


Conf2 


(0.64, 0.48, 0.26) 


20 


(9,30) x (10, 18) 


450 


Conf3 


(0.64, 0.48, 0.26) 


15 


(14, 50) 


700 


Conf4 


(0.66, 0.48, 0.08) 


10 


(9, 50) 


450 


Confo 


(0.66, 0.48, 0.08) 


20 


(9,30) x (10, 18) 


450 


Confo 


(0.66, 0.48, 0.08) 


15 


(14, 50) 


700 



confidence of the point Mass £fi x to be the true shape pa- 
rameter and implicitly to the level of homogeneity of the 
regions. 

In addition, the non exceedence probability p used for 
quantiles matching in our algorithm (see Section 2.2) is equal 
to 1 — l/2n, where n is the target site sample size. This last 
point guarantees that quantiles associated with non excee- 
dence probability 1 — l/2n for both proposal and current 
state of the Markov chain are identical. Other choices for p 
are arguable. Here, we introduce a quantile matching equa- 
tion for a value closely related to the scale parameter and 
for which uncertainties are not too large. 

The analysis was performed on six different case studies 
summarized in Tab. 2. The configurations differ by the 
way information is distributed in space; that is, (a) "small 
regions" with well instrumented but few sites (Confl and 
Con/4); (b) "large regions" with less instrumented and nu- 
merous sites (Con/2 and Con/5) and (c) "medium regions" 
with well instrumented sites and an intermediate number of 
gauging stations. Confl (resp. Con/2, Con/3) correspond 
to Con/4 (resp. Con/5, Con/6) apart from the ((j,r, o~n, £r) 
values. The target site sample size takes the values in 10, 25 
and 40. 1000 regions were generated for each configuration. 
Markov chains of length 15 000 were generated. To ensure 
good mixing properties for all simulated Markov chains, an 
automated trial and error process was used to define pro- 
posal standard deviations of the MCMC algorithm. Fur- 
thermore, the first 2000 iterations were discarded to ensure 
that the equilibrium was reached. 

The performance of each estimator is assessed through 
the three following statistics: 



NBIAS 



SD 



NMSE 



1 \—y Qi — Qi 



\l 



1 k 



Qz-Q 



— NBIAS] 

Qi J 



1 

k^ 

1—1 



Qi Qi 
Qi 



(24) 
(25) 
(26) 



where Qi is the estimate of the theoretical value Qi and k 
is the total number of theoretical values. 

5.1. BAY vs. IFL Approach 

Table 3 shows that, for a small target site sample size and 
quantiles Qo.75 and Qo.95, the BAY approach is more com- 
petitive than the IFL one. Indeed, the three BAY statistics 
(NBIAS, SD, NMSE) are smaller than the ones related to 
IFL. However, for Con/2 and Con/5, IFL Q0.95 estimates 
are more competitive. These two case studies correspond 
to the same configuration - i.e. numerous sites with short 
records. IFL estimates for Q0.995 are always more accurate 
than BAY for all configurations. 

These results indicate that the relative performance of 
BAY compared to IFL depends on the pooling group. 
Thus, for the BAY approach and quantiles Q0.75 and Q0.95, 
it seems preferable to work with less gauging stations but 
which have larger data series, independently of the target 
site sample size. The sensitivity to the configuration of the 
sites and the availability of long time series is a drawback 
for the application of this Bayesian approach. 

These conclusions obtained on stochastic regions are in 
line with a previous analysis on a French data set [Ribatet 
et al., 2007]. The BAY approach is suited to work with 
"small" or "medium" regions and well instrumented gaug- 
ing stations. In addition, this approach is accurate for "rea- 
sonable" quantile estimation - see the bad performance of 
BAY for Qo. 995 in table 3. 
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Table 3. Performance of BAY and IFL estimators for quantile Qo.75iQo.95 and Q0.995. Target site sample size: 10. 



Qo.75 Qo.95 Qo.995 
CTT MhJiiJ? M Ut T A t> CT5 \! M ETC Kf 15 1 A U 





NB1AS 




sD 


NMSE 


NB1AS SD 


NMSE 


N BIAS 




SD 


NMSE 












Confl 












BAY 


0.015 





123 


0.015 


0.001 0.187 


0.035 


-0.006 


0. 


318 


0.101 


IFL 


0.037 





189 


0.037 


0.025 0.195 


0.038 


-0.004 


0. 


.230 


0.053 












Conf2 












BAY 


0.019 





122 


0.015 


0.030 0.249 


0.063 


0.110 





.561 


0.326 


IFL 


0.041 


0. 


.183 


0.035 


0.025 0.191 


0.037 


-0.022 





221 


0.049 












Conf3 












BAY 


0.019 





110 


0.012 


0.006 0.174 


0.030 


-0.003 





.292 


0.085 


IFL 


0.035 





188 


0.037 


0.025 0.195 


0.039 


-0.002 





.222 


0.049 












Conf4 












BAY 


0.009 





104 


0.011 


-0.007 0.149 


0.022 


-0.021 





.233 


0.054 


IFL 


0.023 


0. 


.157 


0.025 


0.022 0.163 


0.027 


0.022 





.192 


0.037 












Conf5 












BAY 


0.018 





109 


0.012 


0.012 0.193 


0.037 


0.033 





.378 


0.144 


IFL 


0.036 





168 


0.029 


0.033 0.173 


0.031 


0.024 





.197 


0.039 












Conf6 












BAY 


0.024 





103 


0.011 


0.001 0.151 


0.023 


-0.038 





222 


0.050 


IFL 


0.028 





.168 


0.029 


0.028 0.177 


0.032 


0.028 





.202 


0.042 



Table 4. Performance of BAY and REV estimators for quantile Q0.75, Qo.95 an d Qo.995- Target site sample size: 10. 



Model ■ 




Q« 


1.75 




Qo.95 




C 


t?0.995 




NBIAS 




SD 


NMSE 


NBIAS SD 


NMSE 


NBIAS 


SD 


NMSE 












Confl 










BAY 


0.015 





.123 


0.015 


0.001 0.187 


0.035 


-0.006 


0.318 


0.101 


REV 


0.011 





119 


0.014 


-0.012 0.159 


0.026 


-0.046 


0.213 


0.047 












Conf2 










BAY 


0.019 





.122 


0.015 


0.030 0.249 


0.063 


0.110 


0.561 


0.326 


REV 


0.005 


0. 


105 


0.011 


-0.026 0.154 


0.024 


-0.066 


0.269 


0.077 












Conf3 










BAY 


0.019 





.110 


0.012 


0.006 0.174 


0.030 


-0.003 


0.292 


0.085 


REV 


0.014 





103 


0.011 


-0.008 0.139 


0.019 


-0.042 


0.185 


0.036 












Conf4 










BAY 


0.009 





.104 


0.011 


-0.007 0.149 


0.022 


-0.021 


0.233 


0.054 


REV 


0.010 





102 


0.011 


0.002 0.136 


0.018 


-0.001 


0.182 


0.033 












Conf5 










BAY 


0.018 





.109 


0.012 


0.012 0.193 


0.037 


0.033 


0.378 


0.144 


REV 


0.013 





097 


0.010 


0.000 0.126 


0.016 


-0.014 


0.171 


0.030 












Conf6 










BAY 


0.024 





.103 


0.011 


0.001 0.151 


0.023 


-0.038 


0.222 


0.050 


REV 


0.031 





099 


0.011 


0.033 0.133 


0.019 


0.034 


0.174 


0.032 



However, the white noise introduced in the generation 
procedure is independent of the target site sample size. It 
only regards both Bayesian approaches. Thus, the perfor- 
mance of the BAY estimator for large sample sizes may be 
too impacted. Indeed, while the IFL estimation procedure 
is not altered, both Bayesian approaches must deal with ar- 
tificially generated biases. 

The main idea for the REV approach is to combine the 
good performance of the BAY estimator for "reasonable" 
quantiles and the efficiency of the IFL approach for larger 
quant iles. 

5.2. BAY vs. REV Approach 

The comparison of the two Bayesian estimators is sum- 
marized in Tab. 4. REV leads to more accurate estimated 
quantiles, in particular for Qo.95 and Qo.995- This last point 
confirms the benefits of using a regional shape parameter 
through a reversible jump approach. 

By construction of the algorithm described in section 2.2, 
Markov chains generated from the REV approach are less 
variable than the ones generated from the BAY model. 
Thus, REV is associated to smaller standard deviation than 
BAY whatever the configuration is (Table 4). Moreover, 
if the regional fixed shape parameter £fi x is suited, REV 
should have the same biases than BAY. Thereby, the REV 
estimator always leads to a smaller NMSE. 



5.3. Global Comparison 

Figures 5 to 7 illustrate the results for different target site 
sample sizes and regions. We concentrate on the NMSE cri- 
teria since it measures variation of the estimator around the 
true parameter value. 

From Figure 5, it is clear that Bayesian estimations, i.e. 
BAY and REV, of Q0.75 are more accurate; specially for 
a target site sample size of 10. For larger target site sam- 
ple sizes, Bayesian approaches are always more competitive 
than the IFL estimator, even if this is less clear-cut on 
the graphs. Furthermore, BAY and REV estimators often 
have the same performance. This result is logical as the 
Q0.75 value is mostly impacted by the location parameter jj,. 
Thus, reversible jumps do not have a significant result on 
REV Q0.75 estimation. 

The plots in Figure 6 and those displayed in Figure 5 
are quite different. For a target site sample size of 10, both 
Bayesian approaches are the most accurate - except for BAY 
applied on Conf2 and Con/5 - and the REV estimator 
leads always to the smallest NMSE. Thus, REV is the 
most competitive model. For larger target site sample sizes, 
REV is at least as accurate as IFL, except for Con/2. 

For Qo.995 and a target site sample size of 10, REV is 
the most accurate model, except for Con/2. As the target 
site sample size increases, the IFL approach becomes more 
efficient. However, for these cases, NMSE for the REV es- 
timator are often close to the IFL ones. Although the BAY 
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Figure 5. Evolution of the NMSE for quantile Qo.75 in function of the region configuration. Target 
site sample size: (a) 10, (b) 25 and (c) 40. 
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from the efficiency of the BAY estimator for quantiles with 
small non exceedence probabilities while being as competi- 
tive as the IFL approach for larger non exceedence proba- 
bilities. 

However, the Bayesian approaches outperform the index 
flood model but differences in accuracy seem to be less and 
less significant as the sample site increases. This may be 
related to the white noise introduce in the generation proce- 
dure. Indeed, this white noise is independent of the target 
site sample size and may strongly penalize the performances 
of the both Bayesian approaches. The next section tries to 
outline the effect of the target site index flood estimation 
error to the quantile estimates. 

6. Effect of Bias on the Target Site Index 
Flood Estimation 

According to the model being considered, two types of 
biases are encountered for the target site index flood esti- 
mation. Indeed, on one hand, the index flood for the IFL 
model is derived from the target site sample. On the other 
hand, for BAY and REV approaches, the index flood is es- 
timated from a scaling model. Thus, biases on index flood 
estimation are due to the relevance of this scaling model but 
also to the index flood error estimation for the other sites 
within the region. 

To illustrates these two types of biases, the normalized 
bias on target site index flood estimation is computed as 
follows: 

Bias(C) = (27) 
O 

where C is the target site index flood, and C is an estimate 
of C. Figure 8 depicts changes in NBIAS for quantile Q0.95 
in function of Bias(C). As normalized biases are considered, 
statistics for the six configurations are plotted in the same 
graphic. Solid lines correspond to local polynomial regres- 
sion fits to help underline trends. 

Scatter-plots in Figure 8 show clearly these two types of 
biases. Indeed, on one hand, the range of Bias(C) is not 
the same for IFL than for BAY and REV, particularly for 
a target site sample size of 25 and 40. On the other hand, 
for the BAY and REV approaches, biases on index flood 
estimation are independent of the target site sample size; 
while this is not the case for IFL. This last point is also 
illustrated as the bias ranges for the Bayesian approaches 
remain the same for all target site sample size. Thus, for 
large sample size, efficiency of the Bayesian estimators may 
be too much impacted as the artificial bias introduced in the 
generation procedure is too penalizing. 

The Bayesian approaches do not have the same behaviour 
than the IFL model. In particular, BAY and REV seem 
to be less sensitive to a large bias in target site index flood 
estimation. NBIAS for the IFL model are clearly linear 
with a response y — x. This last point is an expected result. 
Indeed, apart from sampling variability, if a unique regional 
distribution exists, quantile IFL estimate biases are only in- 
duced by biases on target site index flood estimates. Thus, 
the relevance of the generation procedure is corroborated. 

The main difference between the BAY and REV estima- 
tors is the dispersion around local smoothers. Indeed, REV 
has a smaller range while preserving the same robustness to 
the bias on target site index flood estimation. 

These results and conclusions are independent of the tar- 
get site index flood estimation procedure. However, the per- 
formance of the two Bayesian estimators is related to the 
bias and variance of the target site index flood estimate. 
Thus, for similar variance, these results should be identical 
if GAMs or Kriging were used. 



7. Suggestions for Region Configuration 

This section attempts to present some suggestions for 
building suitable pooling groups according to the considered 
estimator. Hosking and Wallis [1997] already advice not to 
build regions greater than 20 sites because of the small gain 
affected with additional stations. However, they only focus 
on the IFL methodology. We attempt to do the same for 
the two Bayesian estimators considered in this study. For 
this purpose, tables 5, 6 and 7 include the NMSE and the 
related standard errors for each configuration and target site 
sample size. 

From Table 5, the IFL estimator seems to have the same 
performance level independently of the configuration. This 
result points out that the information is not used opti- 
mally as regions with the most information (i.e. Con/3 
and Conf6) do not always lead to better estimations. This 
last point corroborates a previous comments of Ribatet et al. 
[2007]. 

Table 6 shows that the BAY estimator is more accu- 
rate with "medium" regions, i.e. ConfS and Conf6. How- 
ever, results for "small" regions, i.e. Confl and Conf4, 
are often close to the best ones - especially for a light tail. 
Thus, it is preferable to work with well-instrumented sites, 
i.e. Confl, Conf3, Conf4 and ConfQ. 

Table 7 shows that the REV estimator more efficient with 
"medium" regions, i.e. ConfS and Con/6. In addition, it 
seems to be more accurate with few but well-instrumented 
gauging stations rather more but less-instrumented ones. 
Nevertheless for a light tail, all configurations seems to lead 
to similar performance levels. 

Tables 5, 6 and 7 show that the estimation of Q0.75 is 
independent of the region configuration for all estimators. 
Thus, it seems that the regional information is not relevant 
for quantiles with small non exceedence probabilities. 

8. Conclusions 

This article introduced a new Bayesian estimator which 
uses regional information in an innovative way. The pro- 
posed model accounts for a fixed regional shape parameter 
with a non null probability. Thus, as in Ribatet et al. [2007] , 
the regional information is still used to elicit the prior dis- 
tribution. However, the prior distribution is now a mixture 
of a GEV/GPD and a GEV/GPD with only two parameters 
- the remaining one corresponds to the fixed regional shape 
parameter. 

The estimation procedure is achieved using reversible 
jump Markov chains [Green, 1995]; and theoretical de- 
tails for simulated suited Markov chains were presented. 
A sensitivity analysis for the proposed algorithm was per- 
formed. The results showed that the estimates are consis- 
tent provided that the probability attributed to the fixed 
regional shape parameter is positive. In addition, as noticed 
by Stephenson and Tawn [2004] , the credibility intervals are 
sensitive to this probability value. Thus, the proposed esti- 
mator relates this probability value to the homogeneity de- 
gree of the region - using the heterogeneity statistic of Hosk- 
ing and WalUs [1997]. Therefore, the credibility intervals 
take into account the belief about the fixed regional shape 
parameter to be the true value. 

A performance analysis was carried out on stochastic data 
for three different estimators. For this purpose, another 
algorithm which generates stochastic homogeneous regions 
was implemented. The good overall performance of the pro- 
posed estimator has been demonstrated. Indeed, on one 
hand, this approach combines the accuracy of the regional 
Bayesian approach of Ribatet et al. [2007] for quantiles asso- 
ciated to small exceedence probabilities. On the other hand, 



X - 10 



RIBATET ET AL.: REVERSIBLE JUMP TECHNIQUES IN REGIONAL FLOOD FREQUENCY ANALYSIS 



Table 5. Changes in NMSE for Qo.75iQo.95 and Q0.995 in function of the region configuration and the target 
site sample size for the IFL estimator. Related standard errors are displayed in brackets. 



Model 



Confl 



Heavy Tail 
G'ora/2 



Con/3 



Con/4 



Light Tail 
Con/5 



Con/6 



Target site sample size 10 

Q0.75 0.037 (3e-3) 0.035 (3c-3) 0.037 (3c-3) 0.025 (2e-3) 0.029 (2c-3) 0.029(3e-3) 

Q0.95 0.038 (3e-3) 0.037 (3e-3) 0.039 (3c-3) 0.027 (4e-3) 0.031 (2e-3) 0.032 (3e-3) 

Qo.995 0.053 (4e-3) 0.049 (3e-3) 0.049 (4c-3) 0.037 (2c-3) 0.039 (3c-3) 0.042 (4e-3) 

Target site sample size 25 

Q0.75 0.014 (8e-4) 0.015 (lc-3) 0.015 (lc-3) 0.011 (7c-4) 0.011 (7c-4) 0.011(7c-4) 

Qo.95 0.018 (le-3) 0.018 (lc-3) 0.018 (lc-3) 0.014 (9c-4) 0.014 (9c-4) 0.013 (9e-4) 

Qo.995 0.034 (2e-3) 0.032 (2e-3) 0.027 (2c-3) 0.024 (2c-3) 0.023 (2c-3) 0.020 (le-3) 

Target site sample size 40 

Q0.75 0.010 (6e-4) 0.009 (6e-4) 0.010 (6e-4) 0.007 (4e-4) 0.007 (4e-4) 0.007 (5e-4) 

Q0.95 0.013 (8e-4) 0.013 (8e-4) 0.012 (8c-4) 0.010 (6c-4) 0.009 (5e-4) 0.010 (6e-4) 

Qo.995 0-028 (2c-3) 0.028 (2c-3) 0.023 (2c-3) 0.020 (le-3) 0.017 (lc-3) 0.019 (le-3) 



Table 6. Changes in NMSE for Qo. 75, Qo.95 and Qo.995 in function of the region configuration and the target 
site sample size for the BAY estimator. Related standard errors are displayed in brackets. 



Model 



Heavy Tail " " * "~ Light Tail 
Con/1 Con/2 Con/3 Con/4 Con/5 Con/6 

Target site sample size 10 

Q0.75 0.015 (lc-3) 0.015 (lc-3) 0.012 (8e-4) 0.011 (6c-4) 0.012 (9c-4) 0.011(9c-4) 

Qo.95 0.035 (2e-3) 0.063 (4e-3) 0.030 (2c-4) 0.022 (lc-3) 0.037 (3e-3) 0.023 (2e-3) 

Qo.995 0.101 (le-2) 0.326 (3e-2) 0.085 (6c-3) 0.054 (5c-3) 0.144 (lc-2) 0.050 (3e-3) 

Target site sample size 25 

Q0.75 0.010 (6e-4) 0.011 (7e-4) 0.009 (5e-4) 0.008 (5e-4) 0.007 (4c-4) 0.007(5e-4) 

Qo.95 0.026 (2e-3) 0.041 (3c-3) 0.025 (lc-3) 0.017 (lc-3) 0.023 (lc-3) 0.016 (9e-4) 

Qo.995 0.089 (8e-3) 0.212 (2c-2) 0.079 (4c-3) 0.044 (3c-3) 0.086 (6c-3) 0.038 (2e-3) 

Target site sample size 40 

Q0.75 0.008 (5e-4) 0.008 (5e-4) 0.007 (4c-4) 0.005 (3e-4) 0.005 (3e-4) 0.006 (4e-4) 

Qo.95 0.020 (le-3) 0.032 (2e-3) 0.020 (lc-3) 0.012 (8c-4) 0.015 (9e-4) 0.013 (8e-4) 

Qo.995 0-072 (5c-3) 0.187 (2c-2) 0.074 (5c-3) 0.038 (3c-3) 0.070 (6c-3) 0.036 (2e-3) 



Table 7. Changes in NMSE for Qo. 75, Qo.95 and Qo.995 in function of the region configuration and the target 
site sample size for the REV estimator. Related standard errors are displayed in brackets. 



Model 



Con/1 



Heavy Tail 
Con/2 



Con/3 



Con/4 



Light Tail 
Con/5 



Con/6 



Target site sample size 10 

0.011 (7c-4) 0.011 (7c-4) 0.011 (6c-4) 0.010 (7c-4) 0.011(9e-4) 

0.024 (2e-3) 0.019 (lc-3) 0.018 (lc-3) 0.016 (lc-3) 0.019 (2e-3) 

0.077 (2c-2) 0.036 (2c-3) 0.033 (2c-3) 0.030 (2c-3) 0.032 (3e-3) 

Target site sample size 25 

0.009 (6e-4) 0.008 (5e-4) 0.008 (5e-4) 0.006 (4e-4) 0.007(5e-4) 

0.020 (2e-3) 0.016 (9c-4) 0.014 (lc-3) 0.014 (9c-4) 0.014 (9e-4) 

0.061 (lc-2) 0.031 (2c-3) 0.026 (2c-3) 0.032 (3c-3) 0.024 (2e-3) 

Target site sample size 40 

0.007 (5c-4) 0.006 (4c-4) 0.005 (3e-4) 0.005 (3e-4) 0.006 (3e-4) 

0.016 (lc-3) 0.012 (9e-4) 0.010 (7c-4) 0.010 (5e-4) 0.011 (6c-4) 

0.055 (lc-2) 0.027 (2e-3) 0.022 (2c-3) 0.023 (2c-3) 0.021 (le-3) 



Q0.75 0.014 (le-3 

Qo.95 0.026 (2e-3 

Qo.995 0.047 (3c-3 

Q0.75 0.009 (6e-4 

Qo.95 0.019 (lc-3 

Qo.995 0.040 (3e-3 

Q0.75 0.008 (5c-4 

Qo.95 0.015 (lc-3 

Qo.995 0.034 (2e-3 



the duality of the prior distribution (and the fixed regional 
shape parameter) allows the proposed estimator to be at 
least as efficient as the index flood model. Thus, this new 
estimator seems very suited for regional estimation when the 
target site is not well instrumented. 

Furthermore, the two Bayesian approaches considered 
here appear to be less sensitive to biases on target site index 
flood estimation than the index flood estimator. Thus, the 
Bayesian approaches are more readily adaptable which is a 
major advantage as errors on the index flood estimation are 
often uncontrollable. 

As noticed by Ribatet et al. [2007], the index flood model 
does not use information optimally. This point is corrob- 
orated in this study as the model initiated by Dalrymple 
[1960] is not inevitably more accurate as the information 
within the pooling group increases. This is not the case for 
the Bayesian approaches. In addition, they seem to be more 
accurate when dealing with regions with well instrumented 
sites, particularly for large quantiles. 

All statistical analysis were carried out by use of R Devel- 
opment Core Team [2006]. For this purpose, the algorithm 
presented in section 2.2 was incorporated in the evdbayes 



packages [Stephenson and Ribatet, 2006]. The algorithm for 
the generation procedure is available on request from the 
author. 
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Appendix A: The Metropolis-Hastings Algorithm 

In this section, the Metropolis-Hastings algorithm is pre- 
sented. According to the results derived by Green [1995], 
some details will be given to consider the reversible jump 
case. The basic idea of the Metropolis-Hastings algorithm is 
to obtain a Markov chain that converges to a known station- 
ary distribution. The strength of the Metropolis-Hasting 



RIBATET ET AL.: REVERSIBLE JUMP TECHNIQUES IN REGIONAL FLOOD FREQUENCY ANALYSIS 



X- 11 



approach is that the convergence is reached whatever the 
initial state of the Markov chain is and that the distribu- 
tions could be known up to a constant. 

Let / denote the target distribution of interest. Most 
often, in Bayesian inference, ir will be the posterior dis- 
tribution for the parameters. Let q(-,x) be the proposal 
distribution i.e. the proposal states will be sampled from 
this proposal distribution given the current state xt- The 
Metropolis-Hastings algorithm can be summarized as fol- 
lows: 

1. Generate u from a uniform distribution on [0, 1]; 

2. Generate x plop from q(-,x t ) 

OA /(^prop ) | ^prop J 

O. iA c lass <- {(xt) q(Xproplxt) 

4. if u < min (1, A c i ass ) then 

5. %t+l i ^Cprop 

6. else 

7. x t +i <— x t 

8. endif 

9. Go to 1. 

The initial Metropolis-Hastings algorithm can not ac- 
count for dimensional switch. For this purpose, the "jumps" 
between sub-spaces must be defined (see equations (15a)- 
(15c) and (18a)-(18c)) and the quantity A c i ass must be re- 
defined each time a jump is considered. Here, only a simple 
case of the reversible jumps approach is considered (see Sec- 
tion 3.3 of Green [1995]). If only two moves mi(x t ) and 
m,2{xt) can occur with probabilities pi and P2 respectively, 
then the quantity A c i ass must be replaced by A rov . Conse- 
quently, for a proposal move of type mi : 

A rcv = A c l ass — Jl (Al) 
P2 

where Ji is the jacobian of the transformation x t i— > mi(x t ). 
If the proposal move is of type m 2 , then 

P'2 

A rev = A c l a ss 'h (A2) 

Pi 

where J2 is the jacobian of the transformation x t 1— > 7712(2:*). 
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